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A HYBRID METHOD FOR THE 
OPTIMAL LINEAR CONTROL OF NONLINEAR SYSTEMS 


By Jay M. Lewallen 
SUMMARY 

The relative advantages and disadvantages of indirect 
and direct optimization methods have been known for some 
time. This development illustrates the compatibility between 
a perturbation method (indirect) and a gradient method (direct) . 
Moreover, the investigation reveals exactly how estimates 
of the initial Lagrange multipliers, needed for the indirect 
method, may be obtained from the direct method. After 
several iterations with the direct method, the initial value 
of the Lagrange multipliers may be improved to the point that 
convergence can be achieved with the indirect method. 

INTRODUCTION 

In recent years, considerable interest has been gener- 
ated in methods for solving the nonlinear two-point boundary 
value problem. Tnese methods are quite naturally categorized 
as either direct or indirect. The indirect methods seek 
to satisfy the conditions of mathematical optimality, that 
is, the necessary conditions resulting from the first 
variation of the functional to be extremized. The classical 
optimality conditions are satisfied identically, and iterations 
are continued until the desired terminal constraints are 
satisfied. The indirect methods that have been successfully 
implemented are those proposed by Breakwell et al^, 

Jazwinski ^ , Kenneth and McGill ^ , and Lewallen^. 
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On the other hand, the direct methods seek to improve 
an assumed control program, with the aid of influence 
function concepts , by changing the control program in such 
a manner that some index of performance is extremized and 
the desired terminal constraints are approached. The 
classical optimality conditions are not satisfied identically, 
and therefore the optimal trajectory is only approximated. 

The direct methods that have been successfully implemented 
are those proposed by Kelley^, Bryson and Denham^, 

1616y et al^ 7 ^ 8 ^, McReynolds ^ 9 ^ , and Gottlieb ^ 1 ° ^ . 

The major objection to the indirect methods is that 
the initially assumed Lagrange multipliers must be iterated 
upon, and often the first guess for their values is so poor 
that convergence is never achi' ved. However, if convergence 
occurs, it does so quadraticaliy and the optimal or Eulerian 
control program may be easily evaluated. The major objec- 
tion to the direct methods is that the Eulerian control is 
only approximated. Experience has revealed that even though 
the terminal constraints are adequately satisfied and the 
performance index is extremized, the approximated control 
program is often significantly different from the Eulerian. 
However, the convergence process will usually begin from 
almost any reasonable control program assumption. 

Clearly, a hybrid approach, combining the best features 
of the indirect and direct methods, would be of significant 
value. In the discussion that follows, the perturbation 
method proposed in Reference 1 is shown to be compatible with 
the gradient method proposed in Reference 6. In addition, a 
hybrid method is proposed which illustrates how initial values 
of Lagrange multipliers may be obtained for the indirect 
method by using the direct method. 
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NECESSARY CONDITIONS FOR AN OPTIMAL PROCESS 


In general, the optimal control process can be stated 
as follows: Determine the m- vector of control variables 

u(t) in the interval t < t < t r such that a scalar 

o — — f 

performance index of the form 

I = $(x f , t £ ) (1) 

is minimized, while the n-vector of initial conditions 

x(t ) = x (2) 

v o o 

c 

at a known t , and the q-vector of terminal conditions 
o 7 n 

M(x f , t f ) = 0 (3) 

at an unknown t £ , and the n first-order, nonlinear, 

differential equations 

x = f(x,u,t) (4) 

are satisfied. 

The necessary conditions required for the accomplish- 
ment of the above stated objective are discussed by Tapley 
and Lewallen ^ 1 1 ^ . These conditions may be summarized as 
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follows. In the interval of interest, 



x = H, f (x,u,A,t) 


H x (x,u,X ,t) 


0 = H u (x,u,A,t) 


At the unknown terminal time. 


M(x f , t f ) = 


N(x f> X f , n, t f ) = /p x - X T V 
r(x £ , a £ , n , t £ ) - (P £ + H ) £ 


= o 


The scalar functions P and H are defined as 


( 5 ) 


( 6 ) 


P = t f ) + n M(x^, t^-) 


H = A 1 f (x ,u, t) 


f» - fJ 


(7) 

( 8 ) 


where H is referred to as the generalized Hamiltonian, 
and n is a q-vector of Lagrange multipliers. 

With the indirect methods, it is usually assumed that 

a well defined minimum of H(x,u,A,t) exists so that 

H =0 and H is positive definite. With these 
u uu v 

assumptions, the condition = 0 yields m algebraic 
equations which can be used to eliminate the m control 
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variables in Equations ( 5 - a ) and (5-b) . The results can 
be expressed as 

x = H*(x,A,t) A = "H^(x,A , t) (9) 

where H = H [x,u (x ,X , t) ,X , t ] . Equations (2), (6) and (9) 
lead to a conventional two-point boundary-value problem. 

If the 2n-vectors z and F(z,t) are defined, 

z T = [x T a 1 ] f t = [h x : -h x ] do 

then Equations (9) can be expressed as the 2n-vector 

z = F (z , t) (11) 

Furthermore, Equations (6) define the terminal boundary 
conditions for Equation (11) to be 

h(z f , t f ) = 0 (12) 

where h is an n+q+1 vector. These n+q + 1 conditions 
may be used to determine the n values of X(t Q ) » the 
q values of n , and the one value of t^ . 

INDIRECT OPTIMIZATION METHOD 

The indirect optimization methods seek to satisfy the 
above stated necessary cond tions required for optimality, 
that is, Equations (2), (5), and (6). In implementing the 
perturbation methods, as proposed in References 1 and 2, 
Equations (2) and (5) are satisfied identically, and 
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iterations are continued until Equations (6) are satisfied. 

If convergence occurs, the satisfaction of these terminal 
constraints is usually approached quadrat ically . In 
implementing the quas i 1 ineari zat ion methods, as proposed 
in References 3 and 4, Equations (2) and (6) may be satis- 
fied identically, and iterations are continued on a linear- 
ized version of Equations (5) until the Equations (5) 
themselves are satisfied. In the following investigation, 
the only indirect method considered will be the perturbation 
method. 

The perturbation methods require a reference solution 
from which to begin. The equations that describe this 
reference trajectory are given in Equation (9). Since the 
initial state is given in Equation (2), a solution may be 
generated by assuming n initial Lagrange multipliers, X ( t o ) , 
and integrating Equations (9) forward. This integration is 
continued until some assumed terminal time is reached. An 
estimate of this terminal time may be made by determining the 
time when one of the specified constraints is satisfied 
identically. After the terminal time is reached and before 
the correction for the next iteration is made, q values of 
n must be determined. 

In addition to the above assumptions, some consideration 
must be given to the behavior of trajectories near the 
reference path. The study of these nearby trajectories 
require the perturbation of the equations that define a 
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reference solution, that is. Equations (5) must be 
perturbed. This operation leads to 


6x = H Ax 6x * H >u 6u 


6X = - H xx 6X ' H xu 6u ' H xX 6X 


0 ‘ H ux 6X + H uu 6u * H uX 6X 


where the coefficients must be evaluated on the reference 
path. These relations are simply a set of linearized equa- 
tions which describe Dossible perturbations to the reference 
traj ectcry . 

Equation (15) may be solved for the control variation 


Su = -H- 1 (H 6x + H , 6 X) 

uu ^ ux uX 


provided H^ u is nonsingular. This variation is eliminated 
from Equations (13) and (14) to provide 


where 


A i ; A 2 6x 

f ~ j - 

A3 i -A i 6 X 


Ai = H. - H, H * 1 H 

Xx Xu uu ux 


A 2 = -H. H" 1 H . 

2 Xu uu 11 X 


A, = -H + H H ~ 1 K 

XX XU uu ux 
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These 2n first-order differential equations may be used 

to describe the variations in the x and X histories 

due to control variations. In order to relate fix and 

6A at t r to the corrections 6\^ at dt , the properties 
f o o 

of a linear system of ordinary differential equations are 
used. Let $(t) be a 2n * n matrix whose columns are 
solutions of Equation (17) , hence 



where $ T (t ) = [C „ : I_ v _] . The matrix $(t) simply 
o nxn * n x n 

represents solutions that result when a unit perturbation is 
made in the unknown initial conditions. 

Since these perturbed trajectories are considered, allow- 
ances must be made for the perturbed path to miss the desired 
terminal constraints given in Equations (6). Hence, the 
reference terminal constraints are perturbed to yield 


dM = 


M fix + M dt 
x 


(19) 


dN = 


« ' P xx 


M x 


( K 


• |K J - P xx f 


=) 


P xt 1 dt 


= 0 


( 20 ) 


dR = r T 6X + (H x + P xt ) fix + dr, 


+ c»t * p tx f + p tt> dt 


= 0 


( 21 ) 
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where the first-order approximation d()=6()+()dt 
has been used to relate total changes in a quantity ( ) 
to variations in ( ) . During the initial iterations, the 
Equations (19), (20) and (21) are not necessarily satisfied; 
the associated dissatisfaction or error is denoted by 
evaluating M, N and R , respectively. Before N and R 
may be evaluated, however, a nominal value of n must be 
determined. This is accomplished by solving the first q 
of the n equations represented by N for the q values 
of n . 


To determine the n+q+1 corrections for X(t Q ), n, 
and t^ , Equations (18) , (19) , (20) and (21) may be com- 
bined to yield. 


dM 


M $ 

X X 

1 1 
1 0 1 

i-1 

■ i 

M 


1 i 
1-*° 1 

dN 

— 

- P $ 

T I 

! -m t i 

P - h t 


dn 



X xx x 

— -H - 

X X 


dR 



T 1 

p xt^x| M ti 

! i 

P + H 
t t 

f 

dtf 

_ - 


( 22 ) 


T 

where dN = [00* ••0 dN -••••dN ] . The first q elements 
1 q+1 n 

of this vector are set equal to zero because the first q 
equations of N have been satisfied identically to determine 
the nominal values of n . 


In problems where the constraints are relatively simple, 
the Lagrange multipliers n are eliminated from the start, 
that is, the terminal constraint M is not included in 
Equation (7). This requires the elimination of the 
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dependent variations in the transversal ity condition 


■ > T ) f * (-, • »), 


dt 


= 0 


( 23 ) 


that result from the first variation. This is accomplished 
by perturbing the desired terminal constraint relation, 
Equation (3) , to produce 



(24) 


Now, q of the dx^ and dt^ in Equation (24) are solved 
for in terms of the remaining n-q variations. These q 
variations are eliminated from the n+1 variations in 
Equation (23) leaving only n+l-q independent variations. 
The coefficients of the n+l-q variations are equated to 
zero, and along with the q relatiors in Equation (3), 
provide n+1 conditions for the n+1 values of X q and 
t^ . If the above-discussed n+1 terminal conditions 
become the elements of an n+1 vector h , the equation 
analogous to Equation (22) is seen to be 


[dh] 


r 

1 

6X 

h $ 
lx X 

i h * **J 

0 

dt^ 


(25) 


The computational procedure for the indirect optimi- 
zation method with the Lagrange multipliers eliminated from 
the start may be summarized as follows: 


(1) Integrate the 2n nonlinear differential equations 
of motion and the Euler-Lagrange equations, Equation (11), 
forward from t to an assumed t^ with starting 


10 



conditions consisting of the n known initial con- 
ditions satisfying Equation (2) and n assumed values 
for the unknown Lagrange multipliers. 

(2) Simultaneously with the above integration, inte- 
grate the 2n perturbation equation. Equation (17), 
with starting conditions described after Equation (18). 
The perturbation coefficients are formed from the 
variables that describe the reference trajectory. 

(3) Solve the n+1 linear algebraic equations, 
Equation (25), for a linear approximation of the 
corrections that must be applied to the assumed 
initial values of the Lagrange multipliers and the 
terminal time. 

(4) Apply these corrections and repeat the process 
until the corrections on the terminal norm become 
smaller than some preselected value. 

DIRECT OPTIMIZATION METHOD 


The objectives of the optimization problem are the 
same regardless of the method of solution. Hence, the 
direct optimization formulation involves the previously 
described equations. Equations (1) (2) (3) and (4). It 
is found convenient to partition the q-vector of terminal 
contraints M(x^, t^) =0 to 



^(x f , t f ) 
^ (*£ , t ^) 


(26) 
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where y is a q-1 vector and ft is a scalar stopping 
condition to be satisfied identically. Although use of this 
stopping condition is a convenient way to determine an approx- 
imate terminal time, one must use judgment to insure the 
selected terminal constraint will in fact be satisfied during 
initial iterations. 

If the differential equation, Equation (4), is linear- 
ized about some nominal path, the resulting equation becomes 

Sx = f 5x «■ f 6u (27) 

X 1-4 

where the partial derivatives f and f u are evaluated 
on the nominal path. The equation adjoint to Equation (27) 
is 

X = - f* X (28) 

where X is an n-vector of the adjoint variables. It 
should be noted that Equation (28) and Equation (5-b) are 
identical. Equation (28) may be combined with Equation (27) 
to yield 

^(X T 8x) = X T f u 5u (29) 

Integrating this equation and considering Equation (2) yields 

(X T 6x)^ = J* ^ X T f u 6u dt (30) 

t o 

which is designated the fundamental guidance equation. The 
object now is to determine how initial state variations and 
integrated control variations influence the performance 
index, stopping condition, and the terminal constraints. 
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If, on separate trials, the terminal values of the 
adjoint variables are set equal to 



- 

'a*! 

Nf 

- 

fa,! 

L 9x Jf 


"aft! 

3xj 

vector , X . 
* 

is 


( 31 ) 


X^ is an n vector. The desired relations are seen to be 


d4> = 


r 


^ xj f 6u dt + <£ dt r 
4> u Y f 


( 32 ) 


dil» = 


I' 


^ X y f Su dt + f dt, 
i|i u Y i 


( 33 ) 


dn = 


r 


^ xl f 6u dt + ft dt r 
ft u f 


( 34 ) 


where ( ) 


i ♦ |fLj and d() = [«() + (*) dt] f 


This formulation allows the specification of an allow- 
able step size to be taken in control space defined by 


dS 


■ f 


f j 6u T W 6u dt 


( 35 ) 
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where the step is a weighted quadratic function of the control 
deviation. The weighting matrix W is included to improve 
the convergence characteristics by giving more weight to 
regions of low sensitivity. However, unity is often chosen 
because of the lack of knowledge concerning this region. 

The criteria used for determining the best elements for 
this weighting matrix are not easy to determine and are 
usually found through trial and error procedures. 

The stopping condition, ft = 0 in Equation (26), is 
to be identically satisfied so dft in Equation (34) is 
equated to zero. The terminal time variation dt^ is 
eliminated from Equations (32) and (33) to yield 

d$ = J tf f u 5u dt (36) 

t o 
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where the terminal constraints and the control step are 

T 

adjoined by the use of the v and u Lagrange multipliers, 
respectively. It should be noted that the terminal constraints 
are adjoined in the same manner as in Equation (7) for the 
indirect method, and that the elements of v are just the 
first q-1 elements of the r\ vector of Equation (7). 

Since it is desired to determine the control variation 
which corresponds to the maximum change in the performance 
index, the first variation of Equation (38) must vanish; 
therefore 

« d « = f f f u - v T X^ a f u - u 6 u T wj S-u dt = 0 

(39) 

This implies that the desired control variation is 

6u = - W" 1 f T (X. n - .\ in v) (40) 


and when this equation is substituted back into Equations 
(35) and (37), the values of v and p are seen to be 


and 





(41) 


( 42 ) 
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where 


I W 


r* 


X T 0 f W' 1 f T X 0 dt 
f(] u u fill 


(43) 




J tf xf 0 f W" 1 f T X 0 dt 

j u u <t>w 


(44) 


1 = f tf xJ Q f W" 1 f T X 0 dt 

4>4> J ^ 4>il u u 

o 


(45) 


and I AiIi is a q-1 x q-1 matrix, 1^ is a q-1 vector, 


and 


and 



Now combining Equations (40) through (45) yields the 
desired control program 


6u ‘ t w " 1 


- X I " 1 I 1 
\ l>ll> 


ds • d, f T j;; d, f 


- i i i 

ipip 


V4 


+ W' 1 f T X , n IT) dtp 


(46) 


where the positive (negative) sign is used if <J> is to be 
maximized (minimized) . The previous control program is now 
modified by 


u = u , * + 6u . 
new old 


( 47 ) 
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The computational procedure for the direct optimization 
method may be summarized as fv.'lows: 

(1) Integrate the n differential equations of motion, 
Equation (1) forward, using an assumed control 
program and the desired initial conditions. Equa- 
tion (2). This integration is continued until the 
stopping condition, U = 0 in Equation (26), is 
satisfied. The state variable values are stored 

at each integration step. 

(2) Integrate the adjoint equation. Equation (28), back- 
ward q+1 times with the starting conditions, 

T 

Equation (31). The coefficient matrix -f is 
formed from the state variables stored during 
the forward integration. 

(3) Integrate the I equations. Equations (43) through 
(45) backward simultaneously with the adjoint 
equations using initial conditions of zero to 
yield values at t Q for 1^, 1^ , and 1^ . 

(4) Select a desired improvement in the terminal 
dissatisfaction dip = -ip for the next iteration. 

(5) Select a reasonable value for the mean square 
allowable control deviation, and from 

dS ■ 7 au ave (t f ' V 

determine an initial value of dS . 
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(6) Use the selected value of di/> and dS to calcu- 
late the numerator un^er the radical in Equation 
(46). If this quantity is negative, determine 
the dtj; that makes the quantity vanish. If it 
is positive, use the quantity as it is. 

(7) Calculate the 6u as given by Equation (46) and 
alter the assumed control program. The quantity 
dS must be decreased according to some selected 
criteria to prevent stepping across the optimal 
point into a nonoptimal region. 

(8) The procedure is continued until the control varia- 
tions are less than some preselected value. 

HYBRID OPTIMIZATION METHOD 

One objective of developing a hybrid method is to com- 
bine the best characteristics of the two existing methods. The 
two methods in this case have been developed above. The 
indirect method shown has excellent convergence character- 
istics, but determination of initial Lagrange multipliers 
is so critical that often the convergence process is never 
started. On the other hand, the direct method discussed 
will begin to converge from almost any initial guess on 
the control program. However, inherent with this method, 
convergence is never achieved and the classical optimality 
condition H u = 0 is never satisfied. The proposed hybrid 
method will show how good initial estimates of the Lagrange 
multipliers may be obtained for the indirect method from 
several iterations cf the direct method. 
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It is helpful at this point to review the necessary 
conditions for mathematical optimality and see how these 
conditions relate to the methods discussed. In summary, 
the necessary conditions are: 


I . 

• T 

x - 


Equation 

(5-a) 

II . 

X = -h t 

X 


Equation 

(5-b) 

Ill . 

0 = h t 

u 


Equat ion 

(5-c) 

IV. 

X <V = x o 


Equation 

(2) 

V. 

M(x f ,t f ) = 0 


Equation 

(6-a) 


a. iKx f ,t f ) = 0 


Equation 

(26) 


b. ft(x£,t£.) = 0 




VI . 

X T (t f ) = ♦ v T t x - ca x 


Equat ion 

(6-b) 

VII . 

X T f + + V 1 + tflj 

= 0 

Equat ion 

(6-c) 


T T 1 

where n = [v • s] . 

The following table will show which necessary conditions 
are forced to satisfaction and which conditions are used 
to iterate upon until adequate results are obtained for 
several different optimization methods. 
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Indirect Methods 

Direct Methods 


MPF MAF 

GNR MQM 

MSD MHS 

Forced to 
satisfaction 

1,11,111,1V 

III, IV, V, VI, 
VII 

I , II , IV, Vb 

Iterate on 
until adequate 
results are 
abtained. 

V, VI, VII 


I, II 

1 1 1 , V a 


where 


MPF - Met! of Perturbation Functions (see Reference 1 
and section titled Indirect Optimization Method) 

MAF - Method of Adjoint Functions (see Reference 2) 

GNR - Generalized Newton-Raphson (see Reference 3) 

MQM - Modified Quasilinearization Method (see Reference 4) 

MSD - Method of Steepest Descent (see References 5 and 6 
and section titled Direct Optimization Method) 


MHS - Min-H Strategy (see Reference 10) . 


Now, the two transversality conditions VI and VII may 
be combined to yxeld 


(*x + jT *x + ci! x) f + *t + vT *t * cn t| t _ 


0 


( A 8) 
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which may be rearranged to become 

♦ x* + $ t + v T 4» x f + v T ^» t ♦ c^ x f + = 0 (49) 

*f 

Recalling that } = 4> x f + <J> t , i = 4> x f * 

f II I * tf 

and ft = ft f + ft , Equation (49) may be written 

X L . 




Solving for the Lagrange multiplier associated with 
stopping condition yields 




This relation may be substituted back into VI. to provide 


x T Ct f) 


or by using the relations following Equation (37) 

\(tf,' = x ^( t f) + C t: f ) v (53) 

where v may be calculated from Equation (41) . 
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The computational procedure for the hybrid optimization 
method may be summarized as follows: 

(1) The direct method is applied for an iteration. 

(2) The value of v is determined from Equation (41) 
and Equation (53) if used to evaluate A(t^) . 

(3) Equation (28) is integrated backward from t^ to 
t Q providing A(t Q ) . 

(4) The iterations of the direct method are continued 
for some specified length of time. 

(5) The A(t ) obtained from the direct method is 

used as a starting condition in the indirect method. 

(6) If, after two iterations of the indirect method, 
the terminal norm is increased, continue iterations 
with the direct method to improve the estimates 

of X(t Q ) . 

(7) If, after two iterations of the indirect method, 
the terminal norm is decreased, continue iterations 
with the indirect method. 
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CONCLUSIONS 


This investigation has revealed how the Method of 
Perturbation Functions and the Method of Steepest Descent 
may be combined into a hybrid method. The resulting method 
has the advantage of having the best merits of both methods 
while some of the undesirable characteristics of both have 
been eliminated. Moreover, the hybrid is such that either 
method may be used individually, if desired. 
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